syntactic representation
Understanding Syntactic Generalization in Structure-inducing Language Models
Arps, David, Sajjad, Hassan, Kallmeyer, Laura
Structure-inducing Language Models (SiLMs) are trained on a self-supervised language modeling task and induce a hierarchical sentence representation as a byproduct of processing an input. SiLMs couple strong syntactic generalization behavior with competitive performance on various NLP tasks, but many of their basic properties remain underexplored. In this work, we train three different SiLM architectures from scratch: StructFormer (Shen et al., 2021), UDGN (Shen et al., 2022), and GPST (Hu et al., 2024b). We train these architectures on both natural language corpora (English, German, and Chinese) and synthetic bracketing expressions. The models are then evaluated with respect to (i) properties of the induced syntactic representations, (ii) performance on grammaticality judgment tasks, and (iii) training dynamics. We find that none of the three architectures dominates across all evaluation metrics. However, there are significant differences, in particular with respect to the induced syntactic representations. The Generative Pretrained Structured Transformer (GPST; Hu et al., 2024b) performs most consistently across evaluation settings and outperforms the other models on long-distance dependencies in bracketing expressions. Furthermore, our study shows that small models trained on large amounts of synthetic data provide a useful testbed for evaluating basic model properties.
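The synthetic bracketing expressions mentioned in the abstract are easy to picture with a small sketch. The following generator is hypothetical (the paper's actual data pipeline is not shown here): it samples Dyck-style strings over several bracket types, where matching each closing bracket to its opener is precisely the kind of long-distance dependency the models are probed on.

```python
import random

# Bracket types; a closing token must match its opener, which may sit
# arbitrarily far back in the string.
PAIRS = [("(", ")"), ("[", "]"), ("{", "}")]

def sample_dyck(n_pairs=10):
    """Sample a well-nested string with n_pairs matched bracket pairs."""
    stack, tokens, opens_left = [], [], n_pairs
    while opens_left or stack:
        can_open, can_close = opens_left > 0, bool(stack)
        if can_open and (not can_close or random.random() < 0.5):
            op, cl = random.choice(PAIRS)
            stack.append(cl)          # remember which closer is owed
            tokens.append(op)
            opens_left -= 1
        else:
            tokens.append(stack.pop())  # close the most recent open bracket
    return " ".join(tokens)

print(sample_dyck())  # e.g. "[ { } ( ) ] ( )"
```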
A Systematic Comparison of Syntactic Representations of Dependency Parsing
Wisniewski, Guillaume, Lacroix, Ophélie
We compare the performance of a transition-based parser with respect to different annotation schemes. We propose to convert specific syntactic constructions observed in the Universal Dependencies treebanks into a more "standard" representation and to evaluate parsing performance across all the languages of the project. We show that the "standard" constructions do not systematically lead to better parsing performance and that the scores vary considerably across languages.
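To make the kind of conversion studied here concrete, below is a hypothetical example (not the authors' code). Universal Dependencies attaches adpositions to their noun via the `case` relation; one "standard" alternative makes the adposition the head of the prepositional phrase. Tokens use 1-based CoNLL-U-style ids; the `prep`/`pobj` labels are our illustrative choice.

```python
def adposition_as_head(sent):
    """Rewire UD case-marking so the adposition heads the PP (simple, non-stacked case)."""
    by_id = {t["id"]: t for t in sent}
    for tok in [t for t in sent if t["deprel"] == "case"]:
        noun = by_id[tok["head"]]
        # the adposition inherits the noun's attachment to the clause ...
        tok["head"], tok["deprel"] = noun["head"], "prep"
        # ... and the noun becomes the adposition's object
        noun["head"], noun["deprel"] = tok["id"], "pobj"
    return sent

# "She lives in Paris": UD attaches "in" (3) to "Paris" (4) via `case`.
sent = [
    {"id": 1, "form": "She",   "head": 2, "deprel": "nsubj"},
    {"id": 2, "form": "lives", "head": 0, "deprel": "root"},
    {"id": 3, "form": "in",    "head": 4, "deprel": "case"},
    {"id": 4, "form": "Paris", "head": 2, "deprel": "obl"},
]
print(adposition_as_head(sent))  # "in" now heads the PP, "Paris" is its object
```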
Finding Structure in Language Models
When we speak, write or listen, we continuously make predictions based on our knowledge of a language's grammar. Remarkably, children acquire this grammatical knowledge within just a few years, enabling them to understand and generalise to novel constructions that have never been uttered before. Language models are powerful tools that create representations of language by incrementally predicting the next word in a sentence, and they have had a tremendous societal impact in recent years. The central research question of this thesis is whether these models possess a deep understanding of grammatical structure similar to that of humans. This question lies at the intersection of natural language processing, linguistics, and interpretability. To address it, we will develop novel interpretability techniques that enhance our understanding of the complex nature of large-scale language models. We approach our research question from three directions. First, we explore the presence of abstract linguistic information through structural priming, a key paradigm in psycholinguistics for uncovering grammatical structure in human language processing. Next, we examine various linguistic phenomena, such as adjective order and negative polarity items, and connect a model's comprehension of these phenomena to the data distribution on which it was trained. Finally, we introduce a controlled testbed for studying hierarchical structure in language models using various synthetic languages of increasing complexity and examine the role of feature interactions in modelling this structure. Our findings offer a detailed account of the grammatical knowledge embedded in language model representations and provide several directions for investigating fundamental linguistic questions using computational methods.
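The structural-priming paradigm mentioned in this abstract has a compact operational form: a target sentence should be more probable after a prime that shares its structure than after one that does not. A hedged sketch with an off-the-shelf GPT-2 follows; the model choice and the dative example sentences are ours, not the thesis's.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def target_logprob(prime, target):
    """Sum of log P(target tokens | prime) under the language model."""
    ids = tok(prime + " " + target, return_tensors="pt").input_ids
    n_prime = tok(prime, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits.log_softmax(-1)
    # log-probability of each next token given its prefix
    lp = logits[0, :-1].gather(1, ids[0, 1:, None]).squeeze(1)
    return lp[n_prime - 1:].sum().item()  # keep only the target's tokens

do = "The girl gave the boy a book."        # double-object prime
po = "The girl gave a book to the boy."     # prepositional-object prime
target = "The man sent the woman a letter." # double-object target
# a positive difference suggests structural priming toward the DO frame
print(target_logprob(do, target) - target_logprob(po, target))
```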
Interpretable Syntactic Representations Enable Hierarchical Word Vectors
The distributed word representations currently in use are dense and uninterpretable, leading to interpretations that are themselves relative, overcomplete, and hard to interpret. We propose a method that transforms these word vectors into reduced syntactic representations. The resulting representations are compact and interpretable, allowing better visualization and comparison of the word vectors, and we demonstrate that the drawn interpretations are in line with human judgment. The syntactic representations are then used to create hierarchical word vectors using an incremental learning approach, similar to the hierarchical aspect of human learning. As these representations are drawn from pre-trained vectors, the generation process and learning approach are computationally efficient. Most importantly, we find that the syntactic representations provide a plausible interpretation of the vectors, and the subsequent hierarchical vectors outperform the original vectors in benchmark tests.
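The general idea of turning dense vectors into labelled, interpretable dimensions can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual transformation: dense pre-trained vectors are projected onto axes derived from syntactic seed classes, so every dimension of the reduced representation carries a readable label.

```python
import numpy as np

# Seed words per syntactic class; chosen here for illustration only.
SEEDS = {
    "NOUN": ["dog", "table", "idea"],
    "VERB": ["run", "think", "give"],
    "ADJ":  ["red", "happy", "large"],
}

def syntactic_repr(word, emb):
    """emb: dict mapping word -> dense vector (e.g. loaded GloVe vectors)."""
    v = emb[word] / np.linalg.norm(emb[word])
    rep = {}
    for label, seeds in SEEDS.items():
        axis = np.mean([emb[w] for w in seeds], axis=0)
        axis /= np.linalg.norm(axis)
        rep[label] = float(v @ axis)  # cosine similarity to the class axis
    return rep                        # compact and directly interpretable

# Toy demo with random vectors; with real embeddings, "walked" should
# score highest on the VERB axis.
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in ["walked", *sum(SEEDS.values(), [])]}
print(syntactic_repr("walked", emb))
```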
Inference over Unseen Entities, Relations and Literals on Knowledge Graphs
Demir, Caglar, Kouagou, N'Dah Jean, Sharma, Arnab, Ngomo, Axel-Cyrille Ngonga
In recent years, knowledge graph embedding models have been successfully applied in the transductive setting to tackle challenging tasks such as link prediction and query answering. Yet the transductive setting does not allow reasoning over unseen entities or relations, let alone numerical or non-numerical literals. Although increasing effort is being put into exploring inductive scenarios, inference over unseen entities, relations, and literals has remained out of reach. This limitation prevents existing methods from handling real-world dynamic knowledge graphs involving heterogeneous information about the world. Here, we propose a remedy: the attentive byte-pair encoding layer (BytE), which constructs a triple embedding from a sequence of byte-pair-encoded subword units of entities and relations. Compared to the conventional setting, BytE leads to massive feature reuse via weight tying, since it forces a knowledge graph embedding model to learn embeddings for subword units instead of for entities and relations directly. Consequently, the sizes of the embedding matrices are no longer bound to the number of unique entities and relations in a knowledge graph. Experimental results show that BytE improves the link prediction performance of four knowledge graph embedding models on datasets where the syntactic representations of triples are semantically meaningful. However, the benefits of training a knowledge graph embedding model with BytE dissipate on knowledge graphs where entities and relations are represented by plain numbers or URIs. We provide an open-source implementation of BytE to foster reproducible research.
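The core mechanism is compact enough to sketch. The following is a simplified illustration under our own assumptions, not the released BytE implementation: raw bytes stand in for BPE subword units, pooling is a plain mean rather than attention, and scoring is DistMult-style. The point it demonstrates is the abstract's weight tying: one shared subword table replaces per-entity and per-relation embedding matrices, so unseen names can still be scored.

```python
import torch
import torch.nn as nn

class ByteKGE(nn.Module):
    def __init__(self, dim=32, vocab=256):
        super().__init__()
        self.sub = nn.Embedding(vocab, dim)  # shared subword-unit table

    def encode(self, name: str):
        """Embed a surface form from its byte sequence (simplified BPE stand-in)."""
        ids = torch.tensor(list(name.encode("utf-8")))
        return self.sub(ids).mean(0)         # mean-pool the subword embeddings

    def score(self, h, r, t):
        """DistMult-style triple score built entirely from subword units."""
        return (self.encode(h) * self.encode(r) * self.encode(t)).sum()

model = ByteKGE()
# Names never seen during training still receive a score, because only
# subword units, not entities or relations, have parameters.
print(model.score("Berlin", "capitalOf", "Germany"))
```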
PauseSpeech: Natural Speech Synthesis via Pre-trained Language Model and Pause-based Prosody Modeling
Hwang, Ji-Sang, Lee, Sang-Hoon, Lee, Seong-Whan
Although text-to-speech (TTS) systems have improved significantly, most still have limitations in synthesizing speech with appropriate phrasing. For natural speech synthesis, it is important to synthesize speech with a phrasing structure that groups words into phrases based on semantic information. In this paper, we propose PauseSpeech, a speech synthesis system with a pre-trained language model and pause-based prosody modeling. First, we introduce a phrasing structure encoder that utilizes a context representation from the pre-trained language model. In the phrasing structure encoder, we extract a speaker-dependent syntactic representation from the context representation and then predict a pause sequence that separates the input text into phrases. Furthermore, we introduce a pause-based word encoder to model word-level prosody based on the pause sequence. Experimental results show that PauseSpeech outperforms previous models in terms of naturalness. Furthermore, objective evaluations show that our proposed methods reduce the distance between ground-truth and synthesized speech.
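The pause-prediction step can be pictured with a small sketch. This is a hedged illustration of the idea only; the choice of bert-base-uncased and the bare linear head are our assumptions, not the paper's phrasing structure encoder: context representations from a pre-trained LM feed a per-token classifier that decides whether a phrase break follows each position.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased").eval()
pause_head = nn.Linear(lm.config.hidden_size, 2)  # pause / no pause per token

text = "When the weather is nice we walk to the park"
batch = tok(text, return_tensors="pt")
with torch.no_grad():
    ctx = lm(**batch).last_hidden_state  # (1, T, H) context representations
logits = pause_head(ctx)                 # (1, T, 2) break decision per token
print(logits.argmax(-1))                 # predicted pause sequence (head untrained)
```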
Jafariakinabad
Writing style is a combination of consistent decisions at different levels of language production, including the lexical, syntactic, and structural levels, associated with a specific author (or group of authors). While lexical models have been widely explored in style-based text classification, relying on content makes a model less scalable when dealing with heterogeneous data spanning various topics. Syntactic models, on the other hand, are content-independent and therefore more robust against topic variance. In this paper, we introduce a syntactic recurrent neural network that encodes the syntactic patterns of a document in a hierarchical structure. The model first learns syntactic representations of sentences from their sequences of part-of-speech tags. The sentence representations are then aggregated into a document representation using recurrent neural networks.
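The two-level hierarchy this abstract describes maps naturally onto a pair of nested RNNs. A minimal sketch follows; the dimensions, the LSTM choice, and the POS inventory are our assumptions: a sentence-level LSTM reads POS-tag embeddings, and a document-level LSTM reads the resulting sentence vectors.

```python
import torch
import torch.nn as nn

POS = {p: i for i, p in enumerate(["DET", "NOUN", "VERB", "ADJ", "ADP", "PUNCT"])}

class SyntacticRNN(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.emb = nn.Embedding(len(POS), dim)
        self.sent_rnn = nn.LSTM(dim, dim, batch_first=True)
        self.doc_rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, doc):  # doc: list of sentences, each a list of POS tags
        sent_vecs = []
        for tags in doc:
            ids = torch.tensor([[POS[t] for t in tags]])
            _, (h, _) = self.sent_rnn(self.emb(ids))
            sent_vecs.append(h[-1])              # final hidden = sentence vector
        _, (h, _) = self.doc_rnn(torch.stack(sent_vecs, dim=1))
        return h[-1]                             # content-independent doc vector

doc = [["DET", "NOUN", "VERB", "PUNCT"], ["NOUN", "VERB", "ADJ", "PUNCT"]]
print(SyntacticRNN()(doc).shape)                 # torch.Size([1, 32])
```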
Disentangling Syntax and Semantics in the Brain with Deep Networks
Caucheteux, Charlotte, Gramfort, Alexandre, King, Jean-Remi
The activations of language transformers like GPT-2 have been shown to map linearly onto brain activity during speech comprehension. However, the nature of these activations remains largely unknown, and they presumably conflate distinct linguistic classes. Here, we propose a taxonomy that factorizes the high-dimensional activations of language models into four combinatorial classes: lexical, compositional, syntactic, and semantic representations. We then introduce a statistical method to decompose, through the lens of GPT-2's activations, the brain activity of 345 subjects recorded with functional magnetic resonance imaging (fMRI) while they listened to ~4.6 hours of narrated text. The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, encompassing the bilateral temporal, parietal, and prefrontal cortices. Second, contrary to previous claims, syntax and semantics are not associated with separate modules but instead appear to share a common and distributed neural substrate. Overall, this study introduces a versatile framework to isolate the distributed representations of linguistic constructs in brain activity.
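The linear mapping this line of work relies on is the standard encoding-model step, sketched below. This is a generic illustration with random placeholder data, not the paper's taxonomy-based factorization: brain activity is ridge-regressed onto model activations and the fit is scored on held-out samples.

```python
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 768))  # GPT-2-sized activation vectors per sample
# Synthetic "voxel": a weak linear function of the activations plus noise.
y = X @ rng.normal(size=768) * 0.1 + rng.normal(size=500)

X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]
model = Ridge(alpha=100.0).fit(X_tr, y_tr)       # linear encoding model
r, _ = pearsonr(model.predict(X_te), y_te)       # held-out fit quality
print(f"encoding score r = {r:.2f}")             # how linearly X maps onto y
```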